Viewing the proteome from oligopeptides and prediction of protein function.

Authors

  • Hisayuki Horai
  • Kouichi Doi
  • Hirofumi Doi

Abstract

Building a lexicon of relatively short oligopeptides has been one of our first steps toward viewing the proteome from the perspective of oligopeptides. As an application of the lexicon, we propose a new method for predicting protein function, in particular Gene Ontology terms (GO terms), based on statistical characteristics of oligopeptides. In the lexicon, a known function of a protein is inherited by its oligopeptides, and the correspondence between oligopeptides and the function is calculated over all proteins. Our method then uses this correspondence to predict unknown protein functions automatically. We measured prediction performance for several GO terms with recall-precision graphs, using all 28,520 human proteins registered in RefSeq. The GO terms include 'membrane', 'nucleus', 'ATP binding', 'hydrolase activity', 'GTP binding', 'intracellular signaling cascade' and 'ubiquitin cycle'. In most cases, the method scores 70% recall with 80% precision. Prediction for ATP binding and GTP binding performs particularly well, scoring 80% recall with 80% precision. Even in the worst case (ubiquitin cycle), it scores 62.6% recall with 80% precision. These results suggest that the proposed method is effective for predicting GO terms.
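
The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the oligopeptide length `k=4`, the use of the fraction of annotated proteins as the "correspondence", the averaging rule in `score`, and all function names are assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict


def oligopeptides(sequence, k=4):
    """Return the set of distinct length-k oligopeptides in a protein sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def build_lexicon(annotated, k=4):
    """Map each oligopeptide to the fraction of proteins containing it
    that carry the GO term (an assumed form of the 'correspondence').

    annotated: list of (sequence, has_term) pairs with known annotation.
    """
    total = defaultdict(int)
    positive = defaultdict(int)
    for sequence, has_term in annotated:
        # Each oligopeptide inherits the known function of its protein.
        for pep in oligopeptides(sequence, k):
            total[pep] += 1
            if has_term:
                positive[pep] += 1
    return {pep: positive[pep] / total[pep] for pep in total}


def score(sequence, lexicon, k=4):
    """Average the lexicon values of the oligopeptides found in the lexicon.

    The averaging rule is an assumption; unseen oligopeptides are ignored.
    """
    known = [lexicon[p] for p in oligopeptides(sequence, k) if p in lexicon]
    return sum(known) / len(known) if known else 0.0


def recall_precision(scores, labels, threshold):
    """Recall and precision when predicting the term for scores >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and l for p, l in zip(predicted, labels))
    fp = sum(p and not l for p, l in zip(predicted, labels))
    fn = sum((not p) and l for p, l in zip(predicted, labels))
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision
```

Sweeping `threshold` over the range of scores and plotting the resulting pairs gives a recall-precision graph of the kind the abstract uses for evaluation.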

Journal title:
  • Genome informatics. International Conference on Genome Informatics

Volume 16, Issue 2

Pages: -

Publication date: 2005